DETAILED ACTION

1. Claims 1-29 have been considered. New claims 23-29 have been added as per 
Applicant's request. 

Papers Submitted 

2. It is hereby acknowledged that the following papers have been received and placed of 
record in the file: IDS as received 11 April 2006 and Amendment as received 11 April 2006.

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

4. Claims 1-22 are rejected under 35 U.S.C. 103(a) as being unpatentable over Greenley, 
U.S. Patent No. 5,761,469 (herein referred to as Greenley) in view of Hannah et al., U.S. Patent 
Number 5,706,481 (herein referred to as Hannah).

5. Regarding claims 1 and 14, taking claim 14 as exemplary, Greenley has taught a 
processing system comprising: 

a. A data processor (Greenley 100 of Fig. 1) comprising: 

i. An instruction execution pipeline comprising N processing stages, each of 
said N processing stages capable of performing one of a plurality of 
execution steps associated with a pending instruction being executed by 
said instruction execution pipeline (Greenley Col.1 lines 34-40);

ii. A data cache (Greenley 180 of Fig. 1) capable of storing data values used
by said pending instruction (Greenley Col.1 lines 42-43);

iii. A plurality of registers (150 of Fig. 1) capable of receiving said data values
from said data cache (Greenley Col.1 lines 41-45);

iv. A load store unit (Greenley 130 of Fig. 1) capable of transferring a first one
of said data values from said data cache to a target one of said plurality of
registers during execution of a load operation (Greenley Col.1 lines 15-21,
63-67 and Col.2 lines 1-7, 13-15);

v. A shifter circuit (Greenley 160, 170 of Fig. 1) associated with said load
store unit capable of one of a) shifting (Greenley Col.2 lines 19-31), b)
sign extending (Greenley Col.2 lines 48-54), and c) zero extending
(Greenley Col.2 lines 45-47) said first data value prior to loading said first
data value into said target register;

b. A memory coupled to said data processor (Greenley Col. 1 lines 41-43); and

c. A plurality of memory-mapped peripheral circuits coupled to said data processor 
for performing selected functions in association with said data processor 
(Greenley Col. 1 line 29 to Col. 2 line 7 and Col. 2 lines 16-31). In regards to 
Greenley, the ICACHE, prefetch unit, and memory subsystems perform selected 
functions, such as storing instructions, fetching instructions from selected
locations, accessing words in memory, at and from certain locations from
memory.

6. Greenley has not explicitly taught bypass circuitry associated with said load store unit
capable of transferring said first data value from said data cache directly to said target register
without processing said first data value in said shifter circuit. However, Greenley has taught a
sign extension unit (Greenley 160 of Fig. 1) that performs a function to fill in unoccupied bits of a
register by extending its sign after it is loaded from the data cache but before it is stored in the
register file (Greenley Col.2 lines 48-50). Hannah has taught bypassing functions circuitry
capable of transferring said first data value to said target without processing said first data value
in said shifter circuit (Hannah column 9, lines 31-67; Figure 11; Figure 12; Figure 13; and Figure
14). A person of ordinary skill in the art at the time the invention was made would have
recognized that bypasses improve the performance of a system by minimizing delays from
unnecessary functions (Hannah column 9, lines 38-44). Therefore, it would have been obvious
to a person of ordinary skill in the art at the time the invention was made to incorporate the
bypassing of Hannah in the device of Greenley to improve system performance.
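
For illustration only, the extension operations discussed above can be sketched as follows. This sketch is not drawn from Greenley or Hannah: the 32-bit register width and the names REG_BITS, zero_extend, and sign_extend are assumptions made for the example. It suggests why a value that already fills the register has no unoccupied bits to fill, so that passing it through an extension stage would leave it unchanged.

```python
REG_BITS = 32  # assumed register width, chosen only for this example
REG_MASK = (1 << REG_BITS) - 1

def zero_extend(value: int, width: int) -> int:
    """Keep the low `width` bits; all upper register bits become 0."""
    return value & ((1 << width) - 1)

def sign_extend(value: int, width: int) -> int:
    """Copy bit `width - 1` of the value into all upper register bits."""
    value &= (1 << width) - 1
    sign_bit = 1 << (width - 1)
    # Two's-complement identity: flip the sign bit, then subtract it.
    return ((value ^ sign_bit) - sign_bit) & REG_MASK

# A byte with its top bit set gains upper 1-bits when sign extended.
byte_result = sign_extend(0x80, 8)

# A full-width value is returned unchanged by either operation.
word = 0xDEADBEEF
word_unchanged = (sign_extend(word, REG_BITS) == word
                  and zero_extend(word, REG_BITS) == word)
```

Under these assumptions, the full-width case is the one in which routing the value around the extension stage would skip no needed work.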

7. Claim 1 is nearly identical to claims 14, 23, and 29. Claim 1 differs from claim 14 in its 
lack of a main memory and memory-mapped peripheral circuits, but comprises the same data 
processor as claim 14, and is therefore rejected for the same reasons. Claim 23 differs in that it 
lacks the load store unit and pipeline limitations, but the rest of the limitations are similar to 
claim 1, and is therefore rejected for the same reasons. Claim 29 differs in that it lacks the load 
store unit and pipeline limitations but has memory and memory-mapped peripheral circuits, 
similar to claim 14, but the rest of the limitations are similar, and is therefore rejected for the 
same reasons. 

8. Regarding claims 2 and 15, taking claim 15 as exemplary, Greenley in view of Hannah
has taught the processing system as set forth in claim 14, wherein said bypass circuitry transfers 
said first data value from said data cache directly to said target register during a load word 
operation (see above rejection of claim 1). While Greenley has taught a different register size 
than the applicant (Greenley Col.2 lines 17-19), the situation when a register has no unoccupied 
bits after being loaded with data from a data cache remains the same, with the size of the register 
and word being moot. Therefore Greenley's loading of a double word has the same
consequences as the applicant's loading of a word. 

9. Claim 2 is nearly identical to claim 15. Claim 2 differs in its parent claim, but comprises 
the same data processor as claim 15, and is therefore rejected for the same reasons. 

10. Regarding claims 3 and 16, taking claim 16 as exemplary, Greenley in view of Hannah 
has taught the data processor as set forth in claim 15, wherein said bypass circuitry (Hannah 
column 9, lines 31-67; Figure 11; Figure 12; Figure 13; and Figure 14) transfers said first data 
value from said data cache directly to said target register at the end of two machine cycles 
(Greenley Col.4 lines 17-20). 

11. Claim 3 is nearly identical to claim 16. Claim 3 differs in its parent claim, but comprises
the same data processor as claim 16, and is therefore rejected for the same reasons. 

12. Regarding claims 4 and 17, taking claim 17 as exemplary, Greenley in view of Hannah 
has taught the data processor as set forth in claim 14, wherein said shifter circuit one of a) shifts, 
b) sign extends, or c) zero extends said first data value prior to loading said first data value into 
said target register during a load half-word operation (see Greenley Col.2 lines 17-20, 24-30, 46- 
47). 

13. Claim 4 is nearly identical to claim 17. Claim 4 differs in its parent claim, but comprises
the same data processor as claim 17, and is therefore rejected for the same reasons. 

14. Regarding claims 5 and 18, taking claim 18 as exemplary, Greenley in view of Hannah 
has taught the data processor as set forth in claim 17, wherein said shifter circuit loads said 
shifted first data value into said target register at the end of two machine cycles (Greenley Col. 4 
lines 17-20), but has not explicitly taught the load taking three machine cycles. However, 
Greenley has taught this two-cycle latency for all load instructions, including those without the 
need for sign extension. In the situation where the load instruction fills the target register 
completely and no sign extension is needed, such as when the data is already properly aligned 
since it is coming from the data cache, which only contains aligned data, or from the ALU, 
which outputs aligned data, the data processor as configured above will execute the load 
instruction at least one cycle faster due to the elimination of the shifter operations. This will 
create a latency of at least one cycle fewer for those load instructions which bypass the sign 
extension unit, and at least one more cycle for those which need sign extension, such as half- 
word load instructions. Because the applicant's claimed load instructions, which take 2 and 3 
cycles for bypassed and sign-extended instructions, respectively, have no claimed advantages 
over the 1 and 2 cycles that Greenley in view of Hannah have taught, but are merely a change in
the magnitude of latency, they are considered to be equivalent and thus taught by Greenley in 
view of Hannah (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 
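
The cycle counts compared in the preceding paragraph can be restated as a small worked example. This is a simplified model and not a description of either reference: it assumes the shifter stage adds exactly one cycle, whereas the paragraph above says "at least one", and the names load_latency and SHIFTER_PENALTY are hypothetical.

```python
SHIFTER_PENALTY = 1  # assumed: the shifter stage costs exactly one cycle

def load_latency(bypass_cycles: int, needs_shifter: bool) -> int:
    """Cycles for a load, given the latency of the bypassed path."""
    return bypass_cycles + (SHIFTER_PENALTY if needs_shifter else 0)

# Combination as characterized above: 1 cycle bypassed, 2 via the shifter.
combined = (load_latency(1, False), load_latency(1, True))

# Claimed latencies: 2 cycles bypassed, 3 via the shifter.
claimed = (load_latency(2, False), load_latency(2, True))

# In this model the gap between the two paths is the same in both cases.
same_gap = (combined[1] - combined[0]) == (claimed[1] - claimed[0])
```

In this model the two schemes differ only by a uniform offset of one cycle, which is the sense in which the paragraph above treats the difference as one of magnitude.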

15. Claim 5 is nearly identical to claim 18. Claim 5 differs in its parent claim, but comprises 
the same data processor as claim 18, and is therefore rejected for the same reasons. 

16. Regarding claims 6 and 19, taking claim 19 as exemplary, Greenley in view of Hannah
has taught the data processor as set forth in claim 14, wherein said shifter circuit one of a) shifts, 
b) sign extends, and c) zero extends said first data value prior to loading said first data value into 
said target register during a load byte operation (see Greenley Col.2 lines 17-20, 36-40, 46-47). 

17. Claim 6 is nearly identical to claim 19. Claim 6 differs in its parent claim, but comprises 
the same data processor as claim 19, and is therefore rejected for the same reasons. 

18. Regarding claims 7 and 20, taking claim 20 as exemplary, Greenley in view of Hannah 
has taught the data processor as set forth in claim 6, wherein said shifter circuit loads said shifted 
first data value into said target register at the end of two machine cycles (Greenley Col. 4 lines 
17-20), but has not explicitly taught the transfer taking three machine cycles. However, 
Greenley has taught this two-cycle latency for all load instructions, including those without the 
need for sign extension. In the situation where the load instruction fills the target register 
completely and no sign extension is needed, such as when the data is already properly aligned 
since it is coming from the data cache, which only contains aligned data, or from the ALU, 
which outputs aligned data, the data processor as configured above will execute the load 
instruction at least one cycle faster due to the elimination of the shifter operations. This will 
create a latency of at least one cycle fewer for those load instructions which bypass the sign 
extension unit, and at least one more cycle for those which need sign extension, such as half- 
word load instructions. Because the applicant's claimed load instructions, which take 2 and 3 
cycles for bypassed and sign-extended instructions, respectively, have no claimed advantages 
over the 1 and 2 cycles that Greenley in view of Hannah have taught, but are merely a change in
the magnitude of latency, they are considered to be equivalent and thus taught by Greenley in
view of Hannah (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 1955)). 

19. Claim 7 is nearly identical to claim 20. Claim 7 differs in its parent claim, but comprises 
the same data processor as claim 20, and is therefore rejected for the same reasons. 

20. Regarding claims 8, 9, 21, and 22, taking claims 21 and 22 as exemplary, Greenley in 
view of Hannah has taught the data processor as set forth in claim 14, but Greenley has not 
explicitly taught 

a. Wherein said bypass circuitry comprises a multiplexer having a first input channel 
coupled to a data output of said data cache; and 

b. Wherein said multiplexer has a second input channel coupled to an output of said 
shifter circuit. 

21. However, Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in
unoccupied bits of a register by extending its sign after it is loaded from the data cache but 
before reaching a register in the register file (Greenley Col.2 lines 48-50). Hannah has taught 

a. Wherein said bypass circuitry comprises a multiplexer having a first input channel 
(Hannah column 9, lines 31-67; Figure 11; Figure 12; Figure 13; and Figure 14);
and 

b. Wherein said multiplexer has a second input channel coupled to an output of 
another device (Hannah column 9, lines 31-67; Figure 11; Figure 12; Figure 13; 
and Figure 14). 

22. A person of ordinary skill in the art at the time the invention was made would have 
recognized that bypasses improve the performance of a system by minimizing delays from 
unnecessary functions (Hannah column 9, lines 38-44). Therefore, it would have been obvious
to a person of ordinary skill in the art at the time the invention was made to incorporate the bypassing of
Hannah in the device of Greenley to improve system performance. 
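
The two-input multiplexer arrangement discussed above can be sketched as a selection between the data cache output and the shifter output. The sketch is illustrative only and is not taken from Greenley or Hannah: the 32-bit word, the LoadOp names, and the functions shifter, mux2, and load are assumptions, and the shifter is modeled as plain zero extension for brevity.

```python
from enum import Enum

class LoadOp(Enum):
    """Hypothetical load types, valued by operand width in bits."""
    WORD = 32
    HALF_WORD = 16
    BYTE = 8

def shifter(cache_out: int, op: LoadOp) -> int:
    """Stand-in for the shifter circuit: zero extend the low bits."""
    return cache_out & ((1 << op.value) - 1)

def mux2(select_first: bool, first: int, second: int) -> int:
    """Two-input multiplexer: pass `first` when selected, else `second`."""
    return first if select_first else second

def load(cache_out: int, op: LoadOp) -> int:
    # First input: data cache output (bypass path).
    # Second input: shifter output.
    return mux2(op is LoadOp.WORD, cache_out, shifter(cache_out, op))

word_value = load(0xDEADBEEF, LoadOp.WORD)       # taken from the bypass input
half_value = load(0xDEADBEEF, LoadOp.HALF_WORD)  # taken from the shifter input
byte_value = load(0xDEADBEEF, LoadOp.BYTE)       # taken from the shifter input
```

The select line here depends only on the load type, so a word load reaches the target register from the first input without depending on the shifter result.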

23. Claims 8 and 9 are nearly identical to claims 21 and 22 respectively. Claims 8 and 9
differ in their parent claims, but comprise the same data processor as claims 21 and 22, and are
therefore rejected for the same reasons.

24. Regarding claim 10, Greenley has taught for use in a processor comprising an N-stage 
execution pipeline (Greenley Col.1 lines 34-40), a data cache (Greenley 180 of Fig. 1), and a
plurality of registers (Greenley 150 of Fig. 1), a method of loading a first data value from the data 
cache into a target one of the registers, the method comprising the steps of: 

a. Determining if a pending instruction in the execution pipeline is one of a load 
word operation, a load half-word operation, and a load byte operation (Greenley 
Col.1 lines 63-67, Col.2 lines 1-7, 17-19 and Col.5 lines 13-24).

b. In response to a determination that the pending instruction is a load half-word 
operation, transferring the first data value from the data cache to a shifter circuit 
and shifting the first data value prior to loading the first data value into the target 
register (Greenley Col.2 lines 24-31). 

c. In response to a determination that the pending instruction is a load byte 
operation, transferring the first data value from the data cache to a shifter circuit 
and shifting the first data value prior to loading the first data value into the target 
register (Greenley Col.2 lines 35-40). 

25. Greenley has not explicitly taught where in response to a determination that the pending
instruction is a load word operation, transferring the first data value from the data cache directly
to the target register without processing the first data value in the shifter circuit. However,
Greenley has taught a sign extension unit (Greenley 160 of Fig. 1) that fills in unoccupied bits of
a register by extending its sign after it is loaded from the data cache but before it is stored in a
certain register in the register file (Greenley Col.2 lines 48-50). Hannah has taught bypassing
sign extension (Hannah column 9, lines 31-67; Figure 11; Figure 12; Figure 13; and Figure 14).
A person of ordinary skill in the art at the time the invention was made would have recognized
that bypasses improve the performance of a system by minimizing delays from unnecessary
functions (Hannah column 9, lines 38-44). Therefore, it would have been obvious to a person of
ordinary skill in the art at the time the invention was made to incorporate the bypassing of Hannah in the
device of Greenley to improve system performance.

26. Regarding claim 11, Greenley in view of Hannah has taught the method as set forth in 
claim 10, wherein the step of transferring the first data value requires two machine cycles during 
a load word operation (Greenley Col.4 lines 17-20). While Greenley has taught a different 
register size than the applicant (Greenley Col.2 lines 17-19), the situation when a register has no 
unoccupied bits after being loaded with data from a data cache remains the same, with the size of 
the register and word being moot. Therefore Greenley's loading of a double word has the same 
consequences as the applicant's loading of a word (In re Rose, 220 F.2d 459, 463, 105 USPQ
237, 240 (CCPA 1955)). 

27. Regarding claim 12, Greenley in view of Hannah has taught the method as set forth in 
claim 10, wherein the step of transferring the first data value requires two machine cycles during 
a load half-word operation (Greenley Col.4 lines 17-20), but has not explicitly taught the transfer
taking three machine cycles. However, Greenley has taught this two-cycle latency for all load 
instructions, including those without the need for sign extension. In the situation where the load 
instruction fills the target register completely and no sign extension is needed, such as when the 
data is already properly aligned since it is coming from the data cache, which only contains 
aligned data, or from the ALU, which outputs aligned data, the data processor as configured
above will execute the load instruction at least one cycle faster due to the elimination of the 
shifter operations. This will create a latency of at least one cycle fewer for those load 
instructions which bypass the sign extension unit, and at least one more cycle for those which 
need sign extension, such as half-word load instructions. Because the applicant's claimed load 
instructions, which take 2 and 3 cycles for bypassed and sign-extended instructions, respectively, 
have no claimed advantages over the 1 and 2 cycles that Greenley in view of Hannah have taught,
but are merely a change in the magnitude of latency, they are considered to be equivalent and 
thus taught by Greenley in view of Hannah (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 
240 (CCPA 1955)). 

28. Regarding claim 13, Greenley in view of Hannah has taught the method as set forth in 
claim 10 wherein the step of transferring the first data value requires two machine cycles during 
a load byte operation (Greenley Col.4 lines 17-20), but has not explicitly taught the transfer
taking three machine cycles. However, Greenley has taught this two-cycle latency for all load 
instructions, including those without the need for sign extension. In the situation where the load 
instruction fills the target register completely and no sign extension is needed, such as when the 
data is already properly aligned since it is coming from the data cache, which only contains
aligned data, or from the ALU, which outputs aligned data, the data processor as configured
above will execute the load instruction at least one cycle faster due to the elimination of the 
shifter operations. This will create a latency of at least one cycle fewer for those load 
instructions which bypass the sign extension unit, and at least one more cycle for those which 
need sign extension, such as half-word load instructions. Because the applicant's claimed load 
instructions, which take 2 and 3 cycles for bypassed and sign-extended instructions, respectively, 
have no claimed advantages over the 1 and 2 cycles that Greenley in view of Hannah have taught,
but are merely a change in the magnitude of latency, they are considered to be equivalent and 
thus taught by Greenley in view of Hannah (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 
240 (CCPA 1955)). 

29. Referring to claim 24, Greenley in view of Hannah has taught 

a. The data value is transferred from the cache to the target register via the bypass 
circuit (see above rejection of claim 1). While Greenley has taught a different 
register size than the applicant (Greenley Col.2 lines 17-19), the situation when a 
register has no unoccupied bits after being loaded with data from a data cache 
remains the same, with the size of the register and word being moot. Therefore 
Greenley's loading of a double word has the same consequences as the applicant's 
loading of a word. 

b. The data value is transferred from the cache to the target register via the shifter 
circuit during a load half-word operation or a load byte operation (see Greenley 
Col.2 lines 17-20, 24-30, 46-47).

30. Referring to claim 25, Greenley in view of Hannah has taught 

a. The bypass circuit (Hannah column 9, lines 31-67; Figure 11; Figure 12; Figure
13; and Figure 14) is capable of transferring the data value from the cache to the
target register at an end of two machine cycles (Greenley Col.4 lines 17-20); and

b. The shifter circuit is capable of providing the modified data value to the target 
register at an end of three machine cycles (Greenley Col.4 lines 17-20), but has
not explicitly taught the load taking three machine cycles. However, Greenley 
has taught this two-cycle latency for all load instructions, including those without 
the need for sign extension. In the situation where the load instruction fills the 
target register completely and no sign extension is needed, such as when the data 
is already properly aligned since it is coming from the data cache, which only 
contains aligned data, or from the ALU, which outputs aligned data, the data 
processor as configured above will execute the load instruction at least one cycle 
faster due to the elimination of the shifter operations. This will create a latency of 
at least one cycle fewer for those load instructions which bypass the sign 
extension unit, and at least one more cycle for those which need sign extension, 
such as half-word load instructions. Because the applicant's claimed load 
instructions, which take 2 and 3 cycles for bypassed and sign-extended 
instructions, respectively, have no claimed advantages over the 1 and 2 cycles that 
Greenley in view of Hannah have taught, but are merely a change in the magnitude
of latency, they are considered to be equivalent and thus taught by Greenley in 
view of Hannah (see In re Rose, 220 F.2d 459, 463, 105 USPQ 237, 240 (CCPA 
1955)). 

31. Referring to claim 26, Greenley in view of Hannah has taught wherein the bypass circuit
comprises a multiplexer having a first input coupled to the cache and a second input coupled to 
the shifter circuit (Hannah column 9, lines 31-67; Figure 11; Figure 12; Figure 13; and Figure 
14). 

32. Referring to claim 27, Greenley in view of Hannah has taught 

a. Shifting, sign extending, or zero extending a first data value from a cache and 
providing a modified first data value to a first of a plurality of registers (Greenley 
Col.1 lines 15-21, 63-67 and Col.2 lines 1-7, 13-15, and 19-54); and

b. Transferring a second data value from the cache to a second of the plurality of 
registers without shifting, sign extending, or zero extending the second data value 
(Greenley Col.1 lines 15-21, 63-67 and Col.2 lines 1-7, 13-15, and 19-54).

33. Referring to claim 28, Greenley in view of Hannah has taught 

a. Shifting, sign extending, or zero extending the first data value comprises shifting, 
sign extending, or zero extending the first data value in response to determining 
that a first pending instruction in a processor is a load byte operation or a load 
half-word operation (see Greenley Col.2 lines 17-20, 24-30, 46-47); and 

b. Transferring the second data value comprises transferring the second data value to 
the second register in response to determining that a second pending instruction in 
the processor is a load word operation (see Greenley Col.2 lines 17-20, 24-30, 46- 
47). 

Response to Arguments 

34. Applicant's arguments, filed 11 April 2006, with respect to claims 1-22 have been fully
considered and are persuasive. The rejection has been withdrawn in favor of the new rejection 
above. 



35. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Aimee J. Li whose telephone number is (571) 272-4169. The 
examiner can normally be reached on M-T 7:00am-4:30pm. 

36. If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Eddie Chan can be reached on (571) 272-4162. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

37. Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



Conclusion 



AJL 

Aimee J. Li 
23 June 2006 




